Mariposa County, CA - home of Yosemite Valley and the impressive Half Dome geologic feature. The transition in vegetation communities between the Central Valley (southwest) and the high Sierra (northeast) is significant enough to be clearly observed with Landsat imagery.

I’ve noticed that broad-scale analyses I run in Google Earth Engine frequently lead to error messages like “Too many pixels in the region” and “Computation timed out.” Often, the workaround for quick visualization is setting a scale coarse enough that computing requirements on Google’s end don’t go nuts. That means that when we look at charts through ui.Chart.image, everything is analyzed across a scaled-down version of the original raster product.
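To get a feel for why that scale setting matters so much, here’s a rough back-of-envelope calculation in Python. The California area figure is my own approximation, not something pulled from Earth Engine:

```python
# Back-of-envelope look at why the `scale` parameter matters.
# The California area figure is a rough assumption (~424,000 km^2),
# not a value computed by Earth Engine.
def approx_pixels(area_km2, scale_m):
    """Approximate pixel count for a region at a given scale (meters/pixel)."""
    return (area_km2 * 1e6) / scale_m ** 2

ca_km2 = 424_000
print(f"{approx_pixels(ca_km2, 30):.2e}")   # 30 m (native Landsat): ~4.7e8 pixels
print(f"{approx_pixels(ca_km2, 500):.2e}")  # 500 m (quick-look): ~1.7e6 pixels
```

Dropping from 30 m to 500 m cuts the pixel count by a factor of almost 300, which is why coarse scales keep interactive charts responsive.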

So, how do we get Google to process and export large raster datasets without running into scaling issues along the way? In this case, less is more, every time. ui.Chart and Map.addLayer() are where we run into problems. They’re great functions for checking your progress along the way, but once you know you have what you want, get rid of every step that doesn’t feed directly into the data pipeline of finding imagery, processing imagery, and exporting imagery. Even when mapping steps don’t lead to errors, they take a long time for large datasets.

Let’s take California as an example. If I want to use a high-definition image of California in a presentation (and I do), the workflow should include:

- Identify region of interest
- Create masks to remove clouds and water
- Select data within a timeframe of interest
- Apply masks
- Calculate target pixel values
- Export image

Disclaimer: You need to be a registered Google Earth Engine user to do this. Their request system has, in my experience, been quick to respond, but be aware that registration may take a few days.

So, to take things step-by-step: First, we’ll be using JavaScript - it was new to me when I started using Earth Engine, and I found Google’s tutorials super helpful for navigating a new language. Begin by declaring the region of interest (in this case, California).

var cali = ee.FeatureCollection('ft:1fRY18cjsHzDgGiJiS2nnpUU3v9JPDc2HNaR7Xk8')
  .filter(ee.Filter.eq('Name', 'California'));
print("California", cali);

Then, define a function to calculate the normalized difference water index (NDWI), which will be used as a mask.

// This function gets a water index from a Landsat 8 image.
var addNDWI = function(image) {
  return image.addBands(image.normalizedDifference(['B3', 'B5']).rename('NDWI'));
};
print("addNDWI function",addNDWI);
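Under the hood, normalizedDifference computes (B3 − B5) / (B3 + B5) for each pixel. A quick pure-Python sketch of that formula (the input values are invented, DN-like numbers, just for illustration):

```python
def normalized_difference(b1, b2):
    # Same formula ee.Image.normalizedDifference applies per pixel:
    # (b1 - b2) / (b1 + b2)
    return (b1 - b2) / (b1 + b2)

# For Landsat 8, B3 is green and B5 is near-infrared, so this yields
# NDWI: positive over water, negative over vegetation and bare soil.
print(normalized_difference(1000, 3000))  # vegetated pixel -> -0.5
print(normalized_difference(600, 200))    # water-like pixel -> 0.5
```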

Next, create a mask that removes cloudy pixels, plus a helper that clips each image to the region of interest.

// This function masks cloudy pixels.
var cloudMask = function(image) {
  var clouds = ee.Algorithms.Landsat.simpleCloudScore(image).select(['cloud']);
  return image.updateMask(clouds.lte(3));
};
print("cloudMask function",cloudMask);

// This function clips imagery to an ROI.
var clipper = function(image){
  return image.clip(cali);
};

Now we can move on to accessing the data we want. Here, I map the functions defined above over the image collection, clipping each image to the region of interest, masking out cloudy pixels, and adding an NDWI band to each image.

// Load a raw Landsat scene and display it.
// Here, we're going to create the median LS8 image
// spanning from the beginning of 2015 to the end 
// of 2017.
var raw = ee.ImageCollection('LANDSAT/LC08/C01/T1')
  .filterDate("2015-01-01","2017-12-31")
  .map(clipper)
  .map(cloudMask)
  .map(addNDWI);
print("Raw image collection", raw);

Calculate the target pixel value and mask out regions classified as water. Here, I want to use the median value of each pixel to produce a median California. Earth Engine has a good tutorial on creating a greenest-pixel composite, which also requires an NDVI band - but I don’t want a greenest-pixel composite, so I won’t be using it here. For more on compositing, check out this link.

// Calculate median value for each band for each pixel
var raw_median = raw.median();
print("median image",raw_median);
// Filter out pixels that can be classified as "water"
var mask = raw_median.select("NDWI").lte(0);
print("water mask", mask);
var maskedComposite = raw_median.updateMask(mask);
print("masked Composite",maskedComposite);
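To make what median() and updateMask() are doing concrete, here’s a toy numpy sketch of the same two operations. All values are invented; ImageCollection.median() works per band and per pixel over the unmasked observations, which np.nanmedian mimics here:

```python
import numpy as np

# Toy stack for one band: 3 acquisition dates x 2 x 2 pixels, with NaN
# where the cloud mask removed data. np.nanmedian mimics what
# ImageCollection.median() does: a per-pixel median of unmasked values.
red = np.array([
    [[0.30, 0.10], [0.20, np.nan]],
    [[0.32, 0.12], [0.22, 0.40]],
    [[0.28, np.nan], [0.24, 0.44]],
])
median_red = np.nanmedian(red, axis=0)

# Toy median NDWI band; keep pixels with NDWI <= 0, the same logic as
# raw_median.select("NDWI").lte(0) plus updateMask() above.
ndwi = np.array([[-0.4, 0.3], [-0.1, -0.2]])
land_only = np.where(ndwi <= 0, median_red, np.nan)
print(median_red)
print(land_only)
```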

Finally, create an RGB composite of the image and export it to Google Drive. It’s going to have a large file size and consequently it’ll take a long time to download, so grab a good book.

// Create an RGB version of the image
var imageRGB = maskedComposite.visualize({bands: ['B4', 'B3', 'B2'], min: 6000, max: 20000});

// Export the image, specifying scale and region.
// Scale = number of meters on each side of pixel
// be sure to set maxPixels to facilitate full export
Export.image.toDrive({
  image: imageRGB.clip(cali),
  description: 'California_LS8_RGB',
  scale: 30,
  region: cali.geometry().bounds(),
  maxPixels: 1e13});

Confirm the task and create a directory to export the raster image. Depending on the size of your image, Earth Engine may break it up into tiles, but don’t worry: they’re georeferenced, so you can mosaic everything together on your own computer quite easily.
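Conceptually, mosaicking those tiles just means placing each one into a shared pixel grid at its offset. A minimal numpy sketch, with made-up offsets - with real GEE exports you would take the offsets from each GeoTIFF’s georeferencing, which is what raster/GDAL tools do for you:

```python
import numpy as np

# Minimal sketch of what mosaicking the exported tiles amounts to:
# placing each tile into a shared pixel grid at its offset. The offsets
# here are invented for illustration.
def mosaic(tiles, shape):
    """tiles: iterable of (row_offset, col_offset, 2-D array)."""
    canvas = np.full(shape, np.nan)
    for row_off, col_off, arr in tiles:
        rows, cols = arr.shape
        canvas[row_off:row_off + rows, col_off:col_off + cols] = arr
    return canvas

tile_a = (0, 0, np.ones((2, 4)))        # "northern" tile
tile_b = (2, 0, np.full((2, 4), 2.0))   # "southern" tile, 2 rows down
full = mosaic([tile_a, tile_b], (4, 4))
print(full)
```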

The level-3 Sierra Nevada ecoregion. Notice that none of the maps here include a scale bar: if you want to give your audience a frame of reference, add one using raster::scalebar(), SDMTools::Scalebar(), or the graticule package.

Eventually, the task will be marked completed. If you’ve made it this far, there should be a file (or series of files) in the new subdirectory you just created on Google Drive. Download the entire subdirectory and proceed with final raster processing. To do this in R: load the relevant libraries and shapefiles, choose an appropriate coordinate window, and plot the data. For US state borders, I used https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html. Raster functions in R aren’t super efficient, so one of these days I may put a similar approach together in Python, but for the time being the R approach works well enough for my needs.

# Load the libraries
library(raster)
library(rgdal)

# Load region of interest
states <- readOGR("/PATH/TO/US_STATES", "cb_2016_us_state_20m")
CA = states[states@data$NAME == "California",]

# Load Landsat data from GEE export - note that this export led to the creation of two tiles.
LS8.1 = stack("/PATH/TO/DL/California_LS8_RGB-0000000000-0000000000.tif")
LS8.2 = stack("/PATH/TO/DL/California_LS8_RGB-0000000000-0000037888.tif")

# Transform the ROI to match CRS of raster
CA = spTransform(CA, CRSobj = crs(LS8.1))

# Plot the data
jpeg("CA_LS8_RGB.jpg",
    width = 32, height = 40, units = "in", res = 300)
par(bg = "black",
    mai=c(2,2,2,2))
plot(CA, border="transparent")
plotRGB(LS8.1, add=T,
        maxpixels=120000000)
plotRGB(LS8.2, add=T,
        maxpixels=120000000)
dev.off()

And finally, you should end up with an image that looks something like the one below. A 32x40" image at 300 pixels per inch has pixel dimensions of 9600x12000. Saved as a jpg, that’s a ~13 MB image.
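That arithmetic is easy to sanity-check, and it also shows why the maxpixels value passed to plotRGB() earlier was large enough:

```python
# Sanity check on the output dimensions, and on the maxpixels value
# passed to plotRGB() above.
width_px = 32 * 300    # 32 inches at 300 ppi
height_px = 40 * 300   # 40 inches at 300 ppi
total_px = width_px * height_px
print(width_px, height_px, total_px)  # 9600 12000 115200000
# 115,200,000 < maxpixels = 120,000,000, so plotRGB() isn't forced
# to downsample below the figure's resolution.
```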

California from space - since we used a high maxpixels argument, raster::plotRGB() is left with a lot of pixels to work with (by default the maxpixels argument is much smaller, and plotRGB() will consequently export a low-resolution image).
